-
The DocEng’19 Competition on Extractive Text Summarization assessed the performance of two new and fourteen previously published extractive text summarization methods. The competitors were evaluated using the CNN-Corpus, the largest test set available today for single-document extractive summarization.
-
This paper details the features and the methodology adopted in the construction of the CNN-corpus, a test corpus for single document extractive text summarization of news articles. The current version of the CNN-corpus encompasses 3,000 texts in English, and each of them has an abstractive and an extractive summary. The corpus allows quantitative and qualitative assessments of extractive summarization strategies.
-
This paper details the development and features of the CNN-corpus in Spanish, possibly the largest test corpus for single-document extractive text summarization in the Spanish language. Its current version encompasses 1,117 well-written texts in Spanish, each of which has an abstractive and an extractive summary. The development methodology adopted allows good-quality qualitative and quantitative assessments of summarization strategies for tools developed in the Spanish language.